Conversation

@Elvis339 commented Jan 28, 2026

Why this should be merged

This patch closes #1639: it adds automated benchmark triggers to catch performance regressions without manual intervention.

  • Daily (40M→41M): Quick 1M block test for fast regression detection
  • Weekly (50M→60M): Comprehensive 10M block stress test
  • Runner: avago-runner-i4i-2xlarge-local-ssd for consistent NVMe performance

Note: should be merged after #1493

How this works

  1. JSON config (.github/benchmark-schedules.json): Defines benchmark parameters per schedule
  2. resolve-config job: Uses actions/github-script to match cron → config
  3. benchmark job: Matrix strategy triggers AvalancheGo's workflow, publishes to GitHub Pages

Manual dispatch still works via workflow inputs.
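The resolve-config step described above can be sketched as follows. This is an illustrative sketch only: the schedule entries, field names, and error handling are assumptions, not the PR's exact contents.

```javascript
// Hypothetical sketch of the resolve-config logic: map the cron expression
// that fired to its benchmark parameters. Schedule entries are illustrative.
const schedules = [
  { name: 'daily-40m-41m',  cron: '0 5 * * *', startBlock: 40_000_000, endBlock: 41_000_000 },
  { name: 'weekly-50m-60m', cron: '0 6 * * 0', startBlock: 50_000_000, endBlock: 60_000_000 },
];

// On a `schedule` event, the payload carries the cron expression that fired,
// so the job can look up the matching benchmark configuration.
function resolveConfig(firedCron) {
  const match = schedules.find((s) => s.cron === firedCron);
  if (!match) throw new Error(`no benchmark config for cron "${firedCron}"`);
  return match;
}

console.log(resolveConfig('0 5 * * *').name); // → daily-40m-41m
```

In the workflow this would run inside an actions/github-script step, with the fired cron taken from the event payload rather than passed in directly.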

How this was tested

  • Verified S3 paths exist
  • Confirmed storage (~443 GB) fits i4i.4xlarge capacity (3.75 TB)
  • Manual workflow dispatch validation (pending)

@Elvis339 self-assigned this Jan 28, 2026
@Elvis339 added the DO NOT MERGE This PR is not meant to be merged in its current state label Jan 28, 2026
@RodrigoVillar left a comment

Could you please rebase this PR on top of #1642 (or whatever PR it's dependent on)? I'm not sure what needs to be reviewed here vs. code that's being reviewed in a different PR.

Comment on lines 153 to 150
summary-always: true
auto-push: true
fail-on-alert: true

q: what's the rationale for setting summary-always and fail-on-alert to true?

@Elvis339 Jan 28, 2026

Both are about visibility and catching regressions early. summary-always ensures results are always visible in the job summary. fail-on-alert ensures we don't silently regress - if performance drops significantly, the workflow fails rather than quietly recording bad data.
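For context, a hedged sketch of what the full github-action-benchmark step could look like; the step name, tool value, and file path are illustrative assumptions, not the PR's exact configuration:

```yaml
# Hypothetical benchmark-action step (names and paths are illustrative).
- name: Publish benchmark results
  uses: benchmark-action/github-action-benchmark@v1
  with:
    tool: 'customSmallerIsBetter'        # assumed result format
    output-file-path: results.json       # hypothetical path
    github-token: ${{ secrets.GITHUB_TOKEN }}
    summary-always: true   # always write results to the job summary
    auto-push: true        # push each data point to the gh-pages history
    fail-on-alert: true    # fail the run instead of silently recording a regression
    # "significant" is governed by alert-threshold; the action's default is
    # 200%, i.e. a result 2x worse than the previous one raises an alert.
    alert-threshold: '200%'
```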


> if performance drops significantly

Hmm what does it mean for performance to drop significantly?

@rkuris left a comment

Nit: Overall, this seems like a lot of somewhat-fragile code to parse json instead of just specifying the parameters in the normal github style.

I think we can save a ton of money by picking a much smaller instance size. I think i4i.xlarge will work at roughly 1/8 the cost.

{
  "name": "daily-40m-41m",
  "cron": "0 5 * * *",
  "runner": "avago-runner-i4i-4xlarge-local-ssd",

This instance type is likely wasting a ton of space and is double the cost of i4i-2xlarge and 4x the cost of i4i-xlarge. The latter has 937G of local disk. Let's lower the cost of this by using smaller instances.

@Elvis339

Infra team opened Jira ticket for this, we can track it here: https://ava-labs.atlassian.net/browse/INFRA-6420

    description: 'Firewood commit/branch/tag to test (leave empty to use the commit that triggered the workflow)'
    default: ''
  libevm:
    description: 'libevm commit/branch/tag to test (leave empty to skip)'
What does 'skip' mean here? Does that mean use main/master?

@Elvis339

Skip means using the libevm version defined in AvalancheGo; I updated the description.

    default: ''
  config:
    description: 'Config (e.g., firewood, hashdb)'
    default: ''
Should this be 'firewood'? What happens if none is specified in the json?

Why do we even need a default here? Same with many of the others. Is it required?

@Elvis339

`default: ''` isn't strictly required; omitting it has the same behavior (an empty string when the input isn't provided). We use it for consistency with AvalancheGo's pattern.

Yes, you're right: in this context the default value for `config` should be `firewood`.

const fs = require('fs');
let matrix;

if (context.eventName === 'schedule') {

Is this true when fired from the scheduled cron job and false otherwise?

@Elvis339

Yes, context.eventName returns the trigger type:

  • schedule when fired by cron
  • workflow_dispatch when triggered manually
  • push, pull_request, etc. for other triggers
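A minimal standalone sketch of branching on the event type, as an actions/github-script step might. `context` is stubbed here so the snippet runs on its own; in a real workflow the action injects it, and the fields of the returned object are illustrative assumptions:

```javascript
// Hypothetical sketch: pick benchmark parameters based on the trigger type.
// In actions/github-script, `context.payload` is the event payload, and for
// `schedule` events it carries the cron expression that fired.
function buildMatrix(context, inputs) {
  if (context.eventName === 'schedule') {
    // Fired by cron: match the cron expression against the schedule config.
    return { source: 'schedule', cron: context.payload.schedule };
  }
  // workflow_dispatch (manual run): take parameters from the user's inputs.
  return { source: 'dispatch', test: inputs.test };
}

const scheduled = buildMatrix(
  { eventName: 'schedule', payload: { schedule: '0 5 * * *' } },
  {},
);
console.log(scheduled.source); // → schedule
```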

Implements workflow to trigger C-Chain reexecution benchmarks in AvalancheGo
and track Firewood performance over time. Supports task-based and custom
parameter modes. Results stored in GitHub Pages via github-action-benchmark.

# Conflicts:
#	.github/workflows/track-performance.yml
- Refactor `bench-cchain` to trigger track-performance.yml instead of
  directly calling AvalancheGo's workflow
- Add input validation and helpful error messages for test/custom params
- Use commit SHA instead of branch name for reproducibility
- Fix AvalancheGo workflow input: use `with-dependencies` format
  ("firewood=abc,libevm=xyz") instead of separate firewood-ref/libevm-ref
- Remove status-cchain, list-cchain, help-cchain (use GitHub UI instead)
- Remove debug logging from track-performance.yml
Remove local developer tooling (justfile recipe, flake.nix, METRICS.md)
to reduce PR scope. These will be submitted in a follow-up PR after
the CI workflow changes are merged.
Daily (40M→41M) and weekly (50M→60M) benchmarks with JSON-based
config and matrix strategy for cleaner workflow management.
@Elvis339 force-pushed the es/scheduled-perf-tracking branch from 2af94f6 to 6ee9ccc on January 28, 2026 at 17:55
- Replace `avago-runner-i4i-4xlarge-local-ssd` with `avago-runner-i4i-2xlarge-local-ssd` for daily and weekly benchmarks.
- Add new input options for `firewood` and `libevm` commits/branches/tags.
- Extend runner choices with additional configurations in the workflow.
- Remove `.github/benchmark-schedules.json` in favor of inline configurations.
- Replace matrix-based strategy with direct output handling for cleaner and more explicit workflow logic.
- Preserve manual and scheduled benchmark support with optimized input handling.
@Elvis339

> Nit: Overall, this seems like a lot of somewhat-fragile code to parse json instead of just specifying the parameters in the normal github style.

Updated. I was trying to stay consistent with AvalancheGo's JSON config approach, but I agree with you: inline config with native GitHub outputs feels more natural here. There's no strong need for consistency on this at this point in time.

@Elvis339 added DO NOT MERGE This PR is not meant to be merged in its current state and removed DO NOT MERGE This PR is not meant to be merged in its current state labels Jan 28, 2026
Elvis339 commented Jan 28, 2026

@rkuris while testing the scheduled benchmark workflow, I uncovered a couple of issues:

  1. v0.1.0 has a broken Nix build
    The flake.nix at that tag tries to read workspaceCargoToml.workspace.package.version, but that field doesn't exist in our Cargo.toml. This was already fixed on main (reads from ffiCargoToml.package.version instead), but the tag is frozen with the bug.

When it's time to cut the next release, this will naturally resolve. Alternatively, we could cut a quick v0.1.1 patch to get performance trends.

Can the team prepare a v0.1.1 patch, or confirm that we're fine waiting for the next release cycle? Performance tracking won't work otherwise, due to the Nix build failure in v0.1.0.


Elvis339 commented Jan 29, 2026

Test case: manual dispatch with custom dependencies

Command:

gh workflow run track-performance.yml \
  --ref es/scheduled-perf-tracking \
  -f test=firewood-101-250k \
  -f runner=avago-runner-i4i-2xlarge-local-ssd \
  -f firewood=v0.0.18 \
  -f avalanchego=firewood-benchmark-base

Run: https://github.com/ava-labs/firewood/actions/runs/21489879261

  • Predefined test: firewood-101-250k
  • Runner: avago-runner-i4i-2xlarge-local-ssd
  • Firewood: v0.0.18
  • AvalancheGo: firewood-benchmark-base branch

- Add workflow_dispatch trigger to gh-pages.yaml for manual runs
- Trigger gh-pages deployment from track-performance.yml after
  benchmark results are pushed to benchmark-data branch

This ensures benchmark results are immediately visible on GitHub Pages
without waiting for a push to main.

Labels

c-chain · DO NOT MERGE (This PR is not meant to be merged in its current state) · performance

4 participants